Evaluating MLNs for Collective Classification


... relationships between instances. Because many datasets can be
modeled this way, where relationships can be modeled as links
between instances, Collective Classification has wide applicability
and can outperform more traditional, attribute-only methods [11].
Classification of pages within a website is a clear example of
Collective Classification. Attributes may include features like the
presence of certain words, while hyperlinks to other pages would
provide relational attributes or "links". Classifying a university
department's website may include labels like "Student Page",
"Instructor Page", "Course Page", etc.

Our study included a number of benchmark algorithms. The following
sections introduce these algorithms as well as our techniques with
MLNs.

2.1 Benchmark Algorithms

The Iterative Classification Algorithm (ICA) is a Collective
Classification algorithm that begins with a bootstrapping process
which computes initial predictions using only independent attributes.
It then iteratively recomputes each node's label using the predicted
labels of neighboring nodes as evidence. It successively recomputes
the unknown labels for a set number of iterations. In this form of
the algorithm, the contribution of relational information from each
neighboring node is weighted equally in class prediction [11]. In
contrast, in the "cautious" variant, relational information is used
more heavily if the neighboring node's label is considered more
reliable [10].
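The ICA loop described above can be sketched as follows. This is a minimal illustration of the non-cautious variant, not the implementation used in the study. The graph representation, the helper `vote_with_fallback`, and the choice to update labels in place within each iteration are assumptions made for the example; a real local classifier would combine attributes and neighbor labels statistically rather than by majority vote.

```python
from collections import Counter


def ica(neighbors, attrs, known, attr_predict, rel_predict, n_iters=10):
    """Iterative Classification Algorithm sketch (non-cautious variant).

    neighbors:    dict node -> list of neighboring nodes
    attrs:        dict node -> that node's independent attributes
    known:        dict node -> label, for nodes whose label is given
    attr_predict: f(attributes) -> label, using attributes only
    rel_predict:  f(attributes, Counter of neighbor labels) -> label
    """
    labels = dict(known)
    unknown = [n for n in neighbors if n not in known]

    # Bootstrap: initial predictions from independent attributes only.
    for n in unknown:
        labels[n] = attr_predict(attrs[n])

    # Re-predict every unknown label for a fixed number of iterations,
    # weighting each neighbor's current label equally.
    for _ in range(n_iters):
        for n in unknown:
            counts = Counter(labels[m] for m in neighbors[n])
            labels[n] = rel_predict(attrs[n], counts)
    return labels


def vote_with_fallback(attr_predict):
    """Hypothetical local classifier: majority neighbor label, with ties
    (or no neighbors) resolved by the attribute-only prediction."""
    def rel_predict(node_attrs, counts):
        ranked = counts.most_common()
        if not ranked or (len(ranked) > 1 and ranked[0][1] == ranked[1][1]):
            return attr_predict(node_attrs)
        return ranked[0][0]
    return rel_predict
```

A cautious variant would differ only in the evidence step: neighbor labels judged unreliable would be down-weighted or ignored when building `counts`.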

Gibbs sampling is a well-studied Monte Carlo technique. At each
state, it samples a label for each node based on the current
predicted label distribution. The most likely label for each node is
the one most often selected. The technique has been shown to usually
perform well [10]. In this study, we chose ICA and Gibbs as
representative of the "non-cautious" and "cautious" classes of
algorithms described by McDowell et al. [10].
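The sampling procedure just described can be sketched roughly as below. The function name, the `label_dist` callback (standing in for the local classifier), the burn-in period, and the fixed random seed are all assumptions for illustration; the source does not specify these details.

```python
import random
from collections import Counter


def gibbs_labels(neighbors, known, init, label_dist,
                 n_samples=200, burn_in=50, seed=0):
    """Gibbs sampling sketch for collective classification.

    neighbors:  dict node -> list of neighboring nodes
    known:      dict node -> given label
    init:       dict node -> starting label for each unknown node
    label_dist: f(node, current labels) -> dict label -> probability
    """
    rng = random.Random(seed)
    labels = dict(known)
    labels.update(init)
    unknown = [n for n in neighbors if n not in known]
    tallies = {n: Counter() for n in unknown}

    for step in range(burn_in + n_samples):
        for n in unknown:
            # Sample a label from the current predicted distribution.
            dist = label_dist(n, labels)
            choices, weights = zip(*dist.items())
            labels[n] = rng.choices(choices, weights=weights)[0]
            # Only tally samples drawn after the (assumed) burn-in.
            if step >= burn_in:
                tallies[n][labels[n]] += 1

    # The most frequently sampled label is the final prediction.
    return {n: tallies[n].most_common(1)[0][0] for n in unknown}
```

Unlike the ICA sketch, each node's label is drawn at random rather than set to the single most probable value, so uncertain neighbors influence later samples less consistently.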

The Relational Bayesian Classifier (RBC) is a non-collective
algorithm. It represents heterogeneous data in such a manner that a
Simple Bayesian Classifier may be used to learn conditional
probabilities for each attribute. Our implementation uses a naive
Bayes classifier, which assumes conditional independence between
features.
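To make the conditional-independence assumption concrete, a small categorical naive Bayes classifier might look like the following. It does not reproduce RBC's handling of heterogeneous relational data; the function names and the Laplace smoothing parameter `alpha` are assumptions introduced for the sketch.

```python
import math
from collections import Counter, defaultdict


def train_nb(rows, labels, alpha=1.0):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    feat_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    values = defaultdict(set)           # feature index -> observed values
    for row, y in zip(rows, labels):
        for i, v in enumerate(row):
            feat_counts[(i, y)][v] += 1
            values[i].add(v)
    return class_counts, feat_counts, values, alpha


def predict_nb(model, row):
    """Return the label maximizing log P(y) + sum_i log P(x_i | y)."""
    class_counts, feat_counts, values, alpha = model
    total = sum(class_counts.values())
    best, best_lp = None, -math.inf
    for y, c in class_counts.items():
        lp = math.log(c / total)
        for i, v in enumerate(row):
            # Each feature contributes independently given the class.
            num = feat_counts[(i, y)][v] + alpha
            den = c + alpha * len(values[i])
            lp += math.log(num / den)
        if lp > best_lp:
            best, best_lp = y, lp
    return best
```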

The next algorithm, MRW, is an example of a relational-only
classifier. It considers only relational information to label
instances. Because it has no independent features with which to
initiate a bootstrapping process, as Gibbs and ICA do, this algorithm
requires nodes with known groundings to be included in the testing
set.

The Multi-Rank Walk (MRW) algorithm simulates random walks along
links from groundings of each class and tallies the number of visits
each node receives. Predictions are created from the relative number
of visits each node receives from the candidate classes. It
outperforms wvRN, a well-known relational-only classifier, in certain
situations, notably when the proportion of known labels is low [8].
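One way to picture the per-class walks is as a random walk with restart, computed by power iteration instead of explicit simulation. The sketch below takes that route; the damping factor of 0.85, the iteration count, and the function and parameter names are assumptions, not details taken from the paper.

```python
def mrw(neighbors, seeds, damping=0.85, n_iters=100):
    """Multi-rank-walk style sketch.

    neighbors: dict node -> list of neighboring nodes
    seeds:     dict label -> list of nodes known to have that label
    Returns a dict node -> predicted label.
    """
    nodes = list(neighbors)
    scores = {}
    for label, seed_nodes in seeds.items():
        # The walk restarts uniformly at the known nodes of this class.
        restart = {n: 0.0 for n in nodes}
        for s in seed_nodes:
            restart[s] = 1.0 / len(seed_nodes)

        rank = dict(restart)
        for _ in range(n_iters):
            nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
            for n in nodes:
                if neighbors[n]:
                    # Spread this node's visit mass evenly over its links.
                    share = damping * rank[n] / len(neighbors[n])
                    for m in neighbors[n]:
                        nxt[m] += share
            rank = nxt
        scores[label] = rank

    # Each node takes the class whose walk visits it most.
    return {n: max(scores, key=lambda lab: scores[lab][n]) for n in nodes}
```

On a six-node path with one seed of each class at opposite ends, for example, each half of the path is assigned the label of its nearer seed.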

2.2 MLNs

A Markov Logic Network (MLN) weds first-order logic with
statistically learned weights. To use an MLN, one must first create a
set of first-order logic rules. Weights are then attached to these
rules by training on a set of data. There are various weight-learning
techniques available, although the Alchemy toolkit provides
implementations only for generative learning, based on
pseudo-log-likelihood, and discriminative learning, based on Voted
Perceptron, Conjugate Gradient, and Newton's Method. With a set of
rules with attached weights, the MLN may be used to infer
probabilities for statements.
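For reference, the usual formulation from the Markov logic literature (not restated in this paper) assigns a probability to each possible world x from the weighted rules. The symbols here are introduced for illustration: w_i is the weight of rule i, n_i(x) is the number of true groundings of rule i in x, and Z is the normalizing constant.

```latex
P(X = x) = \frac{1}{Z} \exp\Big( \sum_{i} w_i \, n_i(x) \Big),
\qquad
Z = \sum_{x'} \exp\Big( \sum_{i} w_i \, n_i(x') \Big)
```

A higher weight therefore makes worlds that satisfy more groundings of that rule exponentially more probable, while a violated rule lowers a world's probability without making it impossible.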

In the case of Collective Classification, links, attributes, and
labels are represented in data sets as atomic formulas. Rules take
the form of first-order logic statements where attribute values imply
certain labels and links between nodes imply the same label between
these nodes. A weight-learning technique is then used to associate
weights with these implication statements. These learned rules may
then be used by inference algorithms, namely Markov Chain Monte Carlo
(MCMC), Maximum a Posteriori (MAP), belief propagation [14], and
MC-SAT [13], to infer probabilities for unlabeled nodes. It should be
noted that the MCMC algorithm is the same basic algorithm as Gibbs
except that the local classifier used to produce node predictions is
the MLN instead of Naive Bayes.
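The two rule shapes described above can be illustrated on a toy graph small enough for exact enumeration, which here stands in for the inference algorithms named in the text. Everything in this sketch is an assumption for illustration: the function name, the hand-picked weights (nothing is learned), and the constraint that each node has exactly one label. Under that constraint, vacuously true groundings add the same constant to every world and cancel on normalization, so only satisfied consequents need to be counted.

```python
import itertools
import math


def mln_marginals(links, attr, known, labels, w_attr, w_link):
    """Exact label marginals for a toy MLN with two rule templates:

      Attr(n, a) => Label(n, y)                  weight w_attr[(a, y)]
      Link(n, m) ^ Label(n, y) => Label(m, y)    weight w_link

    links: list of (node, node) pairs; attr: dict node -> attribute value;
    known: dict node -> given label; labels: list of possible labels.
    """
    nodes = sorted(attr)
    unknown = [n for n in nodes if n not in known]
    marg = {n: {y: 0.0 for y in labels} for n in unknown}
    z = 0.0

    # Enumerate every joint labeling of the unknown nodes (a "world").
    for combo in itertools.product(labels, repeat=len(unknown)):
        world = dict(known)
        world.update(zip(unknown, combo))
        score = sum(w_attr.get((attr[n], world[n]), 0.0) for n in nodes)
        score += w_link * sum(world[a] == world[b] for a, b in links)
        weight = math.exp(score)
        z += weight
        for n in unknown:
            marg[n][world[n]] += weight

    return {n: {y: v / z for y, v in marg[n].items()} for n in unknown}
```

Enumeration grows exponentially with the number of unlabeled nodes, which is why sampling-based or approximate inference is needed on graphs of realistic size.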

3 Methods

3.1 Data Generation

Synthetic data produced with the Proximity toolkit provided us a
robust source of material on which to test. Experiments on synthetic
data included two training sets of 250 nodes and a single 250-node
testing set. In all experiments, we used 10 independent attributes
and ran 10 trials using the same default settings for data generation
as McDowell et al. [10].

The CiteSeer dataset provided another source of testing material,
this time from real data. Again, we generated data using the same
technique as McDowell et al. [10], a 5-fold cross validation, whereby
5 graphs of size 400 were generated with 4 used for training and the
last for testing.
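The fold arrangement can be sketched as below, assuming that each of the five graphs is held out for testing in turn, as is usual for cross validation. The function name and the representation of graphs as list items are hypothetical.

```python
def five_fold_splits(graphs):
    """Yield (training graphs, test graph) pairs, holding out each graph once."""
    for i, test_graph in enumerate(graphs):
        train_graphs = graphs[:i] + graphs[i + 1:]
        yield train_graphs, test_graph
```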

In our synthetic datasets we primarily modified the strength of the
links as a predictor of like labels (the "homophily"), the strength
of attributes as a predictor of label (the "attribute
predictiveness"), the number of possible labels, and finally the
proportion of test set nodes with known
6 Related Work


Collective Classification has been accomplished with a variety of
algorithms. McDowell et al. [10] evaluated a variant of ICA,
"Cautious ICA", which exploits more certain relational information to
classify. Introducing this complexity improved the performance of ICA
in most cases. When tested on the CiteSeer database, Cautious ICA
generally outperformed its non-cautious rivals, especially when using
lesser non-relational attributes.

Dhurandhar and Dobra compared the performance of MLNs against
Relational Dependency Networks [1]. They, however, only utilized MCMC
and MAP inference, using both Generative and Discriminative
weight-learning. Unlike us, they found that the choice of
weight-learning technique and inference algorithm did not
qualitatively affect the results. In particular, they found that
Relational Dependency Networks with Gibbs performed comparably to
MLNs. Our data, however, showed consistent underperformance of MLNs
compared to Gibbs. Their use of data with mainly binary class labels
may explain this difference, as our results indicate that binary
labels provide for more comparable performance for MLNs.

Markov Logic Networks provide a tool with a diverse range of
applications. Chechetka et al. [2] utilize MLNs to collectively
classify entities identified in images. Relational information is
defined as attributes shared commonly between entities in different
pictures. Unlike our data, this provides multiple link types and, as
they found, complicates classification since different links may vary
in importance for classification. They too used the Alchemy toolkit,
although they only used a single set of weight-learning and inference
settings.

One area of MLN study receiving attention is the weight-learning
technique, an intractable problem with several candidate methods.
Lowd and Domingos [9] explored alternatives which improve upon
existing techniques by using second-order information or by modifying
the learning rate for different clauses. Although they too used the
Alchemy toolkit, implementing their techniques as extensions, they
tested on real datasets we did not use, Cora and WebKB, using the
latter for Collective Classification. The algorithms they introduced
improved accuracy over those implemented in Alchemy. However, they
did not compare Collective Classification accuracy to methods outside
MLNs. Huynh and Mooney [5] introduce a method of discriminative
learning based on a max-margin framework which can optimize MLNs for
collective classification accuracy. They too tested on the CiteSeer
database as well as WebKB and utilized the Alchemy toolkit. The
resulting learner provided equal or better performance than the
existing discriminative learning techniques.

Finally, Markov Logic Networks and Collective Classification have
applications outside of the networked data representations we used.
Domingos and Richardson [4] describe link prediction, link-based
clustering, social network modeling, and object identification in an
MLN framework. Riedel and Meza-Ruiz use MLNs for natural language
processing, taking advantage of relational aspects of semantics [15].

7 Future Work

The disparity between the results we found for real and synthetic
data raises the question of how the nature of the data can affect
performance. The data we fed into Alchemy, whether real or synthetic,
had similar characteristics in terms of homophily and link density.
Nevertheless, MLN performance was far superior with the real data.
Further work should be done into how the nature of data, especially
real data, can affect the performance of MLNs.

As mentioned in the related work section, other methods of weight
learning are being explored. Given our experience of dramatically
different results from the two methods we tested, these may yield
increases over current performance levels. Given the computation time
required by the discriminative method, a more efficient learning
technique that matched discriminative techniques in performance would
go far to make MLNs a more viable approach to Collective
Classification.

Finally, our experience of performance increases with larger training
sets invites more exploration into the amount of data needed to train
MLNs.

8 Conclusion

As a Collective Classification tool, we found that MLNs gave
unpredictable results. It is notable too that the computation time to
learn the rule weights discriminatively was significant and dwarfed
the total time required by the superior benchmark algorithms. For the
synthetic data, performance was consistently poor with the techniques
tested and unreliable across different data types, whether altering
the number of labels or the labeled proportion. However, given the
performance disparity between the real and synthetic data, it is
possible that the nature of our testing data affected the
performance, and other types of data may have yielded better results.
Future work should explore these effects in more detail.
Table 13. Discriminative Learning on CiteSeer